Open Domain Generalization with Domain-Augmented Meta-Learning


Yang Shu*, Zhangjie Cao*, Chenyu Wang, Jianmin Wang, Mingsheng Long ( )

School of Software, BNRist, Tsinghua University, China

{shu-y18,caozj14,cy-wang18}@mails.tsinghua.edu.cn, {jimwang,mingsheng}@tsinghua.edu.cn

Abstract


Leveraging available datasets to learn a model with high generalization ability to unseen domains is important for computer vision, especially when annotated data for the unseen domain are unavailable. We study a novel and practical problem of Open Domain Generalization (OpenDG), which learns from different source domains to achieve high performance on an unknown target domain, where the distributions and label sets of the individual source domains and the target domain can all differ. The problem applies to diverse source domains and is widely applicable to real-world scenarios. We propose a Domain-Augmented Meta-Learning framework to learn open-domain generalizable representations. We augment domains at both the feature level, by a new Dirichlet mixup, and the label level, by distilled soft-labeling, which complements each domain with missing classes and knowledge from other domains. We conduct meta-learning over domains by designing new meta-learning tasks and losses that preserve domain-unique knowledge and generalize knowledge across domains simultaneously. Experimental results on various multi-domain datasets demonstrate that the proposed Domain-Augmented Meta-Learning (DAML) outperforms prior methods for unseen domain recognition.

1. Introduction


Deep convolutional neural networks have achieved state-of-the-art performance on a wide range of computer vision applications given access to large-scale labeled data [23, 20, 39, 19]. However, for a target domain of interest, collecting enough training data is often prohibitive. A practical solution is to generalize a model learned on existing data to the unseen domain. Since the existing source datasets available for training may come from different sources, they may fall into different domains and hold different label sets, e.g., ImageNet [8] and DomainNet [36]. Besides, the target domain is totally unknown, and may also exhibit a distribution shift and a different label set from the source domains. We call this valuable and challenging problem Open Domain Generalization

(OpenDG), where we need to learn representations from disparate source domains that generalize well to any unseen target domain, as illustrated in Figure 1.




Figure 1. Open Domain Generalization (OpenDG). Different source domains hold disparate label sets. The goal is to learn generalizable representations from these source domains to help classify the known classes and detect open classes in the unseen target domain.


There are two key challenges in open domain generalization. (1) The distinct source domains and the unseen target domain are drawn from different distributions with a large distribution shift. (2) The different label sets of the source domains cause some classes to appear in many more domains than others. Data of minor classes that exist in only a few domains lack diversity. This makes the problem extremely difficult for existing methods [25, 29].

To address the first challenge, previous works minimize the distribution distance between domains by adversarial learning [34, 29], which successfully closes the domain gap when all source domains share the same label set. However, given the second challenge, the different label sets between domains cause these distribution alignment methods to suffer from severe class mismatch. For the second challenge, a straightforward remedy is to manually over-sample data of minor classes existing in few domains, but the domain diversity of such classes remains limited, so generalization on the minor classes is still inferior to that on other classes.

To generalize from arbitrary source domains to an unseen target domain, we propose a Domain-Augmented Meta-


*Equal contribution.



Table 1. Comparison of the proposed generalization setting with previous settings related to cross-domain learning. The columns list assumptions made by each problem setting. Note that more "✗" entries mean the method needs fewer assumptions and is thus more widely applicable. We can observe that the proposed open domain generalization problem requires no assumptions on the label sets, no target data, and no post-training on target data, which makes it the most general setting. S means source and T means target. Note that "Same between S&T Domains" means the union of all source domain label sets equals the target label set, i.e., whether there are open classes.

| Problem Setting | Same Label Set for S Domains | Same Label Set between S&T Domains | Target Labeled Data for Training | Target Unlabeled Data for Training | Post-Training on Target Labeled Data |
| Domain Adaptation [31, 32] | ✓ | ✓ | ✗ | ✓ | ✗ |
| Domain Adaptation with Category Shift [35, 2, 51] | ✓ | ✗ | ✗ | ✓ | ✗ |
| Multi-Source Domain Adaptation [55] | ✓ | ✓ | ✗ | ✓ | ✗ |
| Multi-Source Domain Adaptation with Category Shift [50] | ✗ | ✓ | ✗ | ✓ | ✗ |
| Domain Generalization [34] | ✓ | ✓ | ✗ | ✗ | ✗ |
| Heterogeneous Domain Generalization [30] | ✗ | ✗ | ✗ | ✗ | ✓ |
| The Proposed Open Domain Generalization | ✗ | ✗ | ✗ | ✗ | ✗ |


Learning (DAML) framework. To close the domain gap between disparate source domains, we avoid distribution matching but learn generalizable representations across domains by meta-learning. To overcome the disparate label sets of open domain generalization, we propose two domain augmentation methods at both feature-level and label-level. At feature-level, we design a novel Dirichlet mixup (Dir-mixup) to compensate for the missing labels. At label-level, we utilize the soft-labeling distilled from other domains' networks to transfer the knowledge of other domains to the current network. DAML learns a representation that embeds the knowledge of all source domains and is highly generalizable to the unseen target domain. We use the ensemble of all source domain network outputs as the final prediction, which naturally calibrates the predictive uncertainty. In summary:

  • We propose a new and practical problem: Open Domain Generalization (OpenDG), which learns from arbitrary source domains with disparate distributions and label sets to generalize to an unseen target domain.

  • We propose a principled Domain-Augmented Meta-Learning (DAML) framework to address open domain generalization. We augment each domain with novel Dir-mixup and distilled soft-labeling to overcome the disparate label sets of source domains and conduct meta-learning across augmented domains to learn open-domain generalizable representations.

  • Experiment results on several multi-domain datasets show that compared to previous generalization methods, DAML achieves higher classification accuracy on both known classes and open classes in an unseen target domain even with extremely diverse source domains.

2. Related Work


In this section, we briefly discuss works related to ours, including domain adaptation, domain generalization, and data augmentation methods. We compare our problem setting with the problem settings of previous works in Table 1.

Domain Adaptation aims to adapt a model from the source domain to the target domain, typically mitigating the domain gap by minimizing a distribution distance [14, 32]. However, classic domain adaptation requires the same label set for the source and target domains. Recent works extend domain adaptation to varied source and target label sets [2, 35, 41, 51], but these solutions rely on target unlabeled data, which are not available in the open domain generalization setting.

Multi-source domain adaptation is more related to our work in that it involves more than one source domain. Most works assume that all the source domains share the same label set [55, 36], which is easily violated in practice since source domains may be collected from different sources. DCN [50] moves a step forward by removing the constraint on the source label sets, but still requires the union of the source label sets to equal the target label set. We instead require no label-set constraint and no target data for training.

Domain Generalization aims to learn a generalizable model from source data only, to achieve high performance in an unseen target domain [22, 34], typically by learning domain-invariant features across source domains [34, 16, 15, 28, 4, 38, 5]. When different source domains hold different label sets, such learning causes class mismatch. CIDDG [29] can avoid the mismatch but still requires all source and target domains to share the same label set; otherwise, the low domain diversity of some classes makes it hard to learn domain-invariant features.

Meta-learning, in contrast, has the potential to learn from highly diverse domains. However, current meta-learning-based domain generalization methods still fail to consider the different label sets of distinct source domains and the open classes in the target domain [25, 1, 10, 27]. Heterogeneous domain generalization [30, 49] has a similar goal of learning generalizable representations, targeting a more powerful pre-trained model by learning from heterogeneous source domains with different label sets. However, it requires additional target labeled data to induce a category model, which does not fit the proposed open domain generalization problem.

Augmentation Statistical learning theory [45] suggests that the generalization of a learning model can be characterized by the model capacity and the diversity of the training data, so data augmentation can improve generalization by increasing data diversity. Basic augmentations, including affine transformation, random cropping, and horizontal flipping, are widely used in image classification [6, 42, 24]. Recently, more advanced augmentations have been proposed. Mixup [54, 44, 18] combines two samples linearly. Cutout [9] removes contiguous sections of input images. CutMix [52] combines the two by filling the cut-out region with patches from other images.
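As a reference point for the Dir-mixup introduced later, classic two-sample mixup can be sketched in a few lines of NumPy (the function name and default `alpha` are our own choices, not from the cited works):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Classic two-sample mixup: convex combination of two inputs and
    their one-hot labels with a Beta(alpha, alpha)-distributed weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y
```

The mixed label y is a convex combination of the two one-hot labels, so it remains a valid probability vector.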

Algorithm 1 Training process of Domain-Augmented Meta-Learning (DAML)

Input: source datasets D1, D2, …, DS; learning rates η and β; Dir-mixup hyper-parameters αmax and αmin
Initialize θs for s = 1, …, S
while not converged do
    Sample a batch of data Btr = {(x1, y1), (x2, y2), …, (xS, yS)} from all source domains D1, D2, …, DS
    for s = 1, …, S do    ▷ Meta-training starts
        αstr ← {αmax, αmin, s}    ▷ Dir-mixup parameter for meta-training
        BsD-mix = {(zsD-mix, ysD-mix)} ← Dir-mixup(αstr, Btr)    ▷ Obtain Dir-mixup samples according to Eqn. (3)
        Bsdistill = {(xs, ysdistill)} ← {Gj, Fj (j ≠ s), Btr}    ▷ Obtain distilled soft-labels according to Eqn. (4)
        Lstr ← {Gs(Fs(xs)), ys, Gs(zsD-mix), ysD-mix, ysdistill} using data in Btr, BsD-mix and Bsdistill    ▷ According to Eqn. (1)
        θFs,Gs′ = θFs,Gs − η ∇θ Lstr
    Sample another batch of data Bobj = {(x1, y1), (x2, y2), …, (xS, yS)} from all source domains D1, D2, …, DS
    for s = 1, …, S do    ▷ Meta-objective starts
        αsobj ← {αmin, αmax, s}    ▷ Dir-mixup parameter for meta-objective
        BsD-mix′ = {(zsD-mix, ysD-mix)} ← Dir-mixup(αsobj, Bobj)    ▷ Obtain Dir-mixup samples according to Eqn. (3)
        Lsobj ← {Gs′(Fs′(xj)) (j ≠ s), yj (j ≠ s), Gs′(zsD-mix), ysD-mix} using data in Bobj and BsD-mix′    ▷ According to Eqn. (2)
        θFs,Gs ← θFs,Gs − β ∇θ (Lstr + Lsobj)    ▷ Update parameters with meta-learning
return θs for s = 1, …, S



Augmentation-based generalization methods promote generalization by augmenting the source data, using adversarial data augmentation [47], gradient-based perturbations [43], self-supervised learning signals [3], and CutMix [33] as the augmentation method. Note that these methods target general cross-domain generalization and are not specifically designed for open domains with disparate label sets.

Different from all previous works, this paper studies open domain generalization, a practical but challenging problem. We develop the DAML framework to conduct meta-learning over augmented source domains. We design a novel Dir-mixup to mix samples from multiple domains, instead of mixing two arbitrary samples as in classic mixup. Dir-mixup bridges all the source domains and compensates each domain with missing classes from other domains, which naturally fits the disparate source label sets. We further propose a new distilled soft-labeling to transfer knowledge across domains.

3. Domain-Augmented Meta-Learning


In this section, we first introduce the open domain generalization (OpenDG) problem. We then introduce Domain-Augmented Meta-Learning (DAML) and describe the step-by-step algorithm and the optimization of the framework, which consists of the proposed domain augmentation and meta-learning on the augmented domains.

3.1. Open Domain Generalization


In open domain generalization (OpenDG), we have multiple source domains D1, D2, …, DS available for training, where each source domain s consists of data-label pairs Ds = {(xs, ys)}, and ys denotes the one-hot label of xs. Note that although we train the model with mini-batches in practice, here we omit the batch size of each domain to simplify notation. We use C to denote the union of all source label sets. In open domain generalization, we place no constraint on the label sets of the different domains. We aim to learn an open-domain generalizable representation from all the source domains that generalizes well to an unseen target domain Dt. Specifically, the target domain, used only for evaluation, consists of fully unlabeled data Dt = {xt}, and its label set Ct may contain classes existing in any source label set or unknown classes absent from the union of source label sets C. The goal at inference is to classify each target sample with the correct class if it belongs to the source label set C, or to label it as "unknown". Note that no target data, even unlabeled, are available for training, which distinguishes OpenDG from domain adaptation [51] and domain generalization [49].
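As a toy illustration of these label-set relationships (the class names below are invented for the example and do not come from the datasets used in the paper):

```python
# Toy illustration of the OpenDG label sets (class names are invented).
source_label_sets = [
    {"dog", "cat"},          # D1
    {"cat", "horse"},        # D2
    {"dog", "elephant"},     # D3
]
C = set().union(*source_label_sets)     # union of all source label sets

target_label_set = {"cat", "horse", "giraffe"}

# Target classes outside C must be labeled "unknown" at inference.
open_classes = target_label_set - C
```

Here "cat" and "horse" are known classes to be classified correctly, while "giraffe" is an open class to be flagged as "unknown".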



Figure 2. The architecture of the proposed DAML framework. We show the computation graph for source domain 1 as an example; the other source domains are computed similarly. In meta-training (upper part, left to right), each source domain is augmented by Dir-mixup (red) and distilled soft-labeling (blue) to compute L1tr, which updates the model parameters to F1′ and G1′. In the meta-objective (lower part, right to left), each source domain is augmented by Dir-mixup (red) to compute L1obj, which finally updates the model parameters.


3.2.The DAML Framework


We propose DAML to address the open domain generalization problem by mitigating the disparate label sets and distribution shifts among the diverse source domains. As shown in Algorithm 1, the idea is to learn generalizable representations by meta-learning over augmented domains.

Augmented Domains As demonstrated in [53, 17], increasing the diversity of the dataset can substantially improve the generalization of the learned representations. Motivated by this idea, we augment each domain to expand dataset diversity. We observe that different domains have different distributions and hold different label sets, which means that each domain contains distinct knowledge but lacks the domain knowledge and class knowledge of the other domains. Based on this observation, we design domain augmentation to address open domain generalization. Our insight is to conduct augmentation at both the feature level and the label level. For feature-level augmentation, we propose a novel Dirichlet mixup (Dir-mixup), which augments each domain by mixing it with multiple domains. For label-level augmentation, we augment each domain by distilling soft-labels from the models of the other domains. The proposed domain augmentation increases the diversity of the data and compensates each domain with its missing feature and class knowledge. The details of the proposed domain augmentation are introduced in Section 3.3.

Meta-Learning We design the learning framework to learn generalizable representations that simultaneously preserve the unique information of each domain and aggregate the knowledge of all domains. Thus, instead of employing a single shared network for all source domains, which would only embed domain-common knowledge, we build an individual classification network composed of a feature extractor Fs and a classifier Gs for each source domain s. We then need to learn a generalizable representation aggregating the information of all source domains. We conduct meta-learning over all the networks, since meta-learning has been demonstrated to learn generalizable representations from highly disparate domains. In each iteration of the parameter update, we first draw a batch of samples from each domain and compute the corresponding Dir-mixup samples and distilled soft-labels (Lines 5-7 in Algorithm 1). Unlike the standard meta-learning loss applied only to the raw data [12], with the augmented domains we design a new meta-training loss as the classification loss on the original data, the domain-augmented data from Dir-mixup, and the soft-labels distilled from the other domain networks. For each domain s, let zs = Fs(xs) be the feature of xs; we define the meta-training loss as

Lstr = E(xs,ys)∼Ds [ −∑k=1^|C| (ys)(k) log Gs(k)(Fs(xs)) ]
     + E(zsD-mix,ysD-mix)∼DsD-mix [ −∑k=1^|C| (ysD-mix)(k) log Gs(k)(zsD-mix) ]
     + E(xs,ysdistill)∼Dsdistill [ −∑k=1^|C| (ysdistill)(k) log Gs(k)(Fs(xs)) ].    (1)

The superscript (k) denotes the probability of the k-th class. DsD-mix and Dsdistill are the augmented domains of Dir-mixup samples and distilled soft-label samples for meta-training on domain s. We compute one gradient-update step for each source network with respect to the meta-training loss: θGs,Fs′ = θGs,Fs − η ∇θ Lstr (Line 9 in Algorithm 1), where η is the step size. The design idea of the meta-objective is to guide the gradient update from the meta-training loss toward the desired goal. Classic meta-learning employs the losses over all sampled tasks as the meta-objective [12]. But our goal is to improve the generalization ability of the model, so, different from the classic meta-objective, we design the meta-objective as the classification loss on the original data and the Dir-mixup data of the other domains with the updated network Gs′, Fs′, which propagates the knowledge of other domains to domain s and promotes knowledge transfer and generalization across domains. The meta-objective is defined as
Lsobj = ∑j≠s E(xj,yj)∼Dj [ −∑k=1^|C| (yj)(k) log Gs′(k)(Fs′(xj)) ]
      + E(zsD-mix′,ysD-mix′)∼DsD-mix′ [ −∑k=1^|C| (ysD-mix′)(k) log Gs′(k)(zsD-mix′) ].    (2)

DsD-mix′ is the augmented domain of Dir-mixup samples for domain s in the meta-objective. Minimizing the meta-objective finds a gradient-descent update that leads the network to classify data in other domains with high accuracy, which encourages the network to learn a generalizable representation performing well across all domains. We finally update the network parameters in one iteration by θs ← θs − β ∇θ(Lstr + Lsobj), where β is the learning rate.
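A first-order sketch of this two-step update for one source network may clarify the flow (the actual method differentiates the meta-objective through the inner update; here a FOMAML-style first-order approximation is used, and all names are our own):

```python
import numpy as np

def daml_meta_step(theta, grad_tr_fn, grad_obj_fn, eta, beta):
    """First-order sketch of the DAML update for one source network
    (cf. Lines 9 and 14 of Algorithm 1).

    grad_tr_fn(theta):        gradient of the meta-training loss L^tr
    grad_obj_fn(theta_prime): gradient of the meta-objective L^obj,
                              evaluated at the inner-updated parameters
    """
    g_tr = grad_tr_fn(theta)
    theta_prime = theta - eta * g_tr       # inner (meta-training) step
    g_obj = grad_obj_fn(theta_prime)       # meta-objective at theta'
    return theta - beta * (g_tr + g_obj)   # outer update on L^tr + L^obj
```

With a toy quadratic L^tr = 0.5(θ−1)² and L^obj = 0.5(θ−3)², the update pulls θ toward a compromise of both objectives.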

3.3. Domain Augmentation


The meta-learning framework can learn a generalizable representation aggregating information from all source domains, where the generalization power relies heavily on the diversity of each source domain. To this end, we design two multi-source domain augmentation approaches: the feature-level augmentation, Dir-mixup, and the label-level augmentation, distilled augmentation. The augmentations compensate for the missing class information in each source domain and further increase domain diversity.

Dir-mixup Mixup [54] generates a new data-label pair as the weighted sum of the features and one-hot labels of existing samples, where the weights are sampled from a pre-defined distribution. We augment the s-th source domain by mixing its data with data from the other domains. Since these data may belong to classes missing from the s-th source domain, mixup augmentation compensates for the missing classes. Mixup also produces inter-domain data, which further increases the diversity of each domain.

However, the original mixup is defined to mix two samples. When applied to open domain generalization with multiple source domains, mixup samples are then only generated from pairs of domains which, as shown in Figure 3, yields samples between two domains (the edges between vertices) but lacks samples mixing multiple domains (the whole area). Moreover, to cover all domain combinations, such pairwise mixup needs O(#domains × #domains) mixup samples. Therefore, to mix multiple domains, we sample the weights from a multivariate distribution instead of the beta distribution used in the original mixup. We select the Dirichlet distribution since it is a multivariate distribution with properties similar to the beta distribution. We then design a new Dir-mixup that mixes samples (one per domain) with a weight vector λ sampled from a Dirichlet distribution parameterized by α. We perform mixup at the feature level. Let z1, z2, …, zS be the features of data from the different domains extracted by the networks; the Dir-mixup augmented data (zD-mix, yD-mix) are calculated as:

λ ∼ Dirichlet(α),
(zD-mix, yD-mix) = ( ∑s=1^S λ(s) zs, ∑s=1^S λ(s) ys ).    (3)
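A minimal NumPy sketch of Eqn. (3); the function and argument names are our own, and one sample per source domain is assumed:

```python
import numpy as np

def dir_mixup(features, labels, alpha, rng):
    """Feature-level Dir-mixup per Eqn. (3): mix one feature/one-hot-label
    pair from each of the S domains with Dirichlet-sampled weights.

    features: (S, d) array of per-domain features z_s
    labels:   (S, C) array of one-hot labels over the joint label set
    alpha:    (S,) concentration parameters of the Dirichlet
    """
    lam = rng.dirichlet(alpha)      # lambda ~ Dirichlet(alpha), sums to 1
    z_mix = lam @ features          # sum_s lambda^(s) z_s
    y_mix = lam @ labels            # sum_s lambda^(s) y_s
    return z_mix, y_mix
```

Because λ lies on the simplex, the mixed label y_mix is a valid soft label over the joint label set.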




Figure 3. Comparison between Dir-mixup and classic mixup. Classic mixup only mixes two samples, so mixup samples only lie on the edges of the triangle, while Dir-mixup mixes samples of multiple domains covering the whole triangle area, meaning Dir-mixup introduces mixup samples with more information and higher diversity.


Compared with recent work using mixup for domain generalization [33, 49], Dir-mixup is more efficient and effective. The parameter α adjusts the distribution to generate different augmentations, better serving the meta-learning process. Consider constructing Dir-mixup for each model s. In meta-training, we want to keep more information and focus more on domain s during mixup, so we set α(s) larger than the other components of α, which statistically assigns a larger weight λ(s) to zs. In the meta-objective, the goal is to transfer knowledge from other domains and improve cross-domain generalization, which is enhanced by mixup results with larger domain discrepancy. So we set α(s) smaller than the other components of α, which statistically induces a smaller λ(s). We employ two hyper-parameters αmax and αmin to realize this idea. For the meta-training of model s, we set αstr to be a length-S vector with all entries αmin but the s-th entry αmax. We generate mixup data with this αstr to form the Dir-mixup augmentation set in the meta-training of model s, i.e., DsD-mix in Equation (1). For the meta-objective, we set αsobj to be a length-S vector with all entries αmax but the s-th entry αmin. The data generated from this αsobj form the Dir-mixup augmentation set for model s, i.e., DsD-mix′ in Equation (2).
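The construction of αstr and αsobj described above can be sketched as follows (the function name and signature are our own):

```python
import numpy as np

def dirmixup_alpha(s, S, alpha_max, alpha_min, meta_training=True):
    """Build the Dirichlet concentration vector for domain s.

    Meta-training (alpha_s^tr): entry s gets alpha_max, the rest get
    alpha_min, so the mixup statistically favors domain s.
    Meta-objective (alpha_s^obj): the roles are swapped to favor
    cross-domain mixing.
    """
    hi, lo = (alpha_max, alpha_min) if meta_training else (alpha_min, alpha_max)
    alpha = np.full(S, lo, dtype=float)
    alpha[s] = hi
    return alpha
```

For example, with S = 3, αmax = 2.0, and αmin = 0.5, domain 1 gets [0.5, 2.0, 0.5] in meta-training and [2.0, 0.5, 2.0] in the meta-objective.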
Distilled Augmentation For the s-th source domain, we further augment it with soft-labels distilled from the other domains, i.e., the output predictions of the other networks. We mix the soft-labels from the other domains to increase the diversity of the augmentation. We set α to be an all-ones vector of dimension S−1, since we have no preference for any particular other domain. The augmentation is defined as

λ ∼ Dirichlet(α),
ysdistill = ∑j=1^{s−1} λ(j) Gj(Fj(xs)) + ∑j=s+1^{S} λ(j−1) Gj(Fj(xs)).    (4)

The soft-label reflects the decisions of the other domains' networks on the s-th domain data, transferring their knowledge to the s-th domain. The augmentation appears as the third term in Equation (1), where we do not back-propagate through Fj, Gj since they are only used to generate the soft-labels. The augmentation regularizes the s-th domain network with the knowledge of the other domains, yielding a more generalizable representation.
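Eqn. (4) can be sketched as follows, where `predict_fns[j]` is a stand-in for the composed network Gj(Fj(·)) returning a probability vector over the joint label set (all names are our own):

```python
import numpy as np

def distilled_soft_label(x_s, predict_fns, s, rng):
    """Distilled soft-label for a domain-s sample per Eqn. (4):
    a Dirichlet-weighted mixture of the other S-1 domains' predictions."""
    others = [j for j in range(len(predict_fns)) if j != s]
    lam = rng.dirichlet(np.ones(len(others)))   # alpha = all-ones, dim S-1
    preds = np.stack([predict_fns[j](x_s) for j in others])
    return lam @ preds                          # convex mix of soft predictions
```

Since each prediction is a probability vector and λ lies on the simplex, the distilled soft-label is itself a valid probability vector.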

3.4. Inference


In the inference stage, we have the networks of all source domains, G1, …, GS, F1, …, FS, trained by the DAML framework as shown in Algorithm 1. For a test sample xt from the target domain Dt, we compute the prediction for xt by aggregating the predictions of all source networks:

$$\hat{y}_t = \frac{1}{S} \sum_{s=1}^{S} G_s(F_s(x_t)). \tag{5}$$

The ensemble of all source domain networks naturally calibrates the prediction confidence and enables DAML to achieve higher performance in the unseen target domain.
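Equation 5, together with the confidence-threshold rejection used later in the open-domain evaluation, can be sketched as follows (the threshold value is illustrative, and `predict_fns` is again a hypothetical interface to the S source networks):

```python
import numpy as np

def ensemble_predict(x_t, predict_fns, threshold=0.5):
    """Average the source networks' softmax outputs (Equation 5).

    Samples whose top confidence falls below the threshold are labeled
    as the open class 'unknown', encoded here as -1.
    """
    y_hat = np.mean([f(x_t) for f in predict_fns], axis=0)
    if y_hat.max() < threshold:
        return -1, y_hat
    return int(y_hat.argmax()), y_hat
```

Averaging over S independently trained heads is what "calibrates" the confidence: a sample only keeps a high top score when the source networks agree.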

4. Experiments


We construct several open domain generalization scenarios with different datasets to evaluate the proposed method.

4.1. Datasets


PACS dataset [26] consists of four domains corresponding to four different image styles: photo (P), art-painting (A), cartoon (C) and sketch (S). The four domains share the same label set of 7 classes. We use each domain as the target domain and the other three as source domains to form four cross-domain tasks. We evaluate the generalization performance on both the original closed-set dataset and the modified open-domain dataset.

Office-Home [46] comprises images from four different domains: Artistic (Ar), Clip art (Cl), Product (Pr) and Real-world (Rw). It has a large domain gap and 65 classes, far more than other DG datasets, making it very challenging. We spread these 65 classes among the four domains to derive an open-domain dataset. We construct four open generalization tasks based on it, where each domain is used as the target domain in turn and the other three serve as source domains.

Multi-Datasets scenario is constructed in this paper to consider a more realistic situation of learning generalizable representations from arbitrary source domains. We simulate the process of obtaining source domains from different resources and learning a generalizable model that achieves high accuracy on an unseen target domain. We leverage several public datasets, including Office-31 [40], STL-10 [7] and Visda2017 [37], as source domains, and evaluate the generalization performance on four domains in DomainNet [36]. There exist distribution discrepancies and huge label-set disparities across the four datasets, which forms a natural open domain generalization scenario. Since DomainNet contains too many open classes, we preserve all classes in the joint label set of the source domains and subsample 20 open classes.

4.2. Closed-Set Generalization


We evaluate the classification accuracy of closed-set generalization on the widely-used domain generalization dataset PACS. The closed-set setting exactly matches the domain generalization setting, so we compare with AGG, supervised learning on the merged data of all source domains; domain-invariant feature learning based methods: CIDDG [29], CSD [38] and DMG [5]; meta-learning based methods: MLDG [25], MetaReg [1], MASF [10] and Epi-FCR [27]; and augmentation based methods: CrossGrad [43], JiGen [3] and CuMix [33]. We do not compare with domain adaptation methods since they need unlabeled target data.

As shown in Table 4, on the closed-set generalization setting, to which previous domain generalization methods are tailored, DAML still outperforms all previous methods on average and achieves at least comparable performance on all the tasks. In particular, DAML outperforms state-of-the-art meta-learning-based DG, which indicates the importance of domain augmentation to learn generalizable representations. DAML surpasses state-of-the-art augmentation-based DG, indicating that the meta-learning paradigm and the carefully designed feature-level and label-level augmentations can enable learning more generalizable representations.

4.3. Open Domain Generalization


We evaluate the generalization performance for situations where the source and target domains have different label sets and open classes exist. We conduct experiments on PACS, Office-Home, and Multi-Datasets. For PACS and Office-Home, we preserve different parts of classes in the source domains and the target domain to create disparate label sets among source domains and between the source and target domains. For Multi-Datasets, we preserve all the classes for all source datasets. We show the class split in each domain in the supplementary materials. We follow [51] to set a threshold on the prediction confidence and label samples with a confidence lower than the threshold as an open class: "unknown". For the evaluation metric, we report the accuracy of data from non-open classes (Acc) and also follow the state-of-the-art universal domain adaptation paper [13] to use H-score to evaluate performance over all target data.
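The H-score of [13] is the harmonic mean of the accuracy on known (non-open) classes and the accuracy of recognizing the "unknown" class, so it is high only when both are high; a minimal helper:

```python
def h_score(acc_known, acc_unknown):
    """Harmonic mean of known-class accuracy and unknown-class accuracy.

    Both inputs are in [0, 1]. A model that ignores open classes entirely
    (acc_unknown = 0) scores 0 regardless of its known-class accuracy.
    """
    if acc_known + acc_unknown == 0.0:
        return 0.0
    return 2.0 * acc_known * acc_unknown / (acc_known + acc_unknown)
```

For example, a model with 80% known-class accuracy but only 40% unknown-detection accuracy scores about 0.533 rather than the 0.60 an arithmetic mean would report.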

Table 2. Results of PACS dataset under the open-domain setting.

| Method | Art Acc | Art H-score | Sketch Acc | Sketch H-score | Photo Acc | Photo H-score | Cartoon Acc | Cartoon H-score | Avg Acc | Avg H-score |
|---|---|---|---|---|---|---|---|---|---|---|
| AGG | 51.35 | 38.87 | 49.75 | 47.09 | 53.15 | 44.19 | 66.43 | 48.98 | 55.17±0.16 | 44.78±0.33 |
| MLDG [25] | 44.59 | 31.54 | 51.29 | 49.91 | 62.20 | 43.35 | 71.64 | 55.20 | 57.43±0.14 | 45.00±0.31 |
| FC [30] | 51.12 | 39.01 | 51.15 | 49.28 | 60.94 | 45.79 | 69.32 | 52.67 | 58.13±0.20 | 46.69±0.25 |
| Epi-FCR [27] | 54.16 | 41.16 | 46.35 | 46.14 | 70.03 | 48.38 | 72.00 | 58.19 | 60.64±0.22 | 48.47±0.29 |
| PAR [48] | 52.97 | 39.21 | 53.62 | 52.00 | 51.86 | 36.53 | 67.77 | 52.05 | 56.56±0.51 | 44.95±0.57 |
| RSC [21] | 50.47 | 38.43 | 50.17 | 44.59 | 67.53 | 49.82 | 67.51 | 47.35 | 58.92±0.46 | 45.05±0.60 |
| CuMix [33] | 53.85 | 38.67 | 37.70 | 28.71 | 65.67 | 49.28 | 74.16 | 47.53 | 57.85±0.32 | 41.05±0.66 |
| DAML (ours) | 54.10 | 43.02 | 58.50 | 56.73 | 75.69 | 53.29 | 73.65 | 54.47 | 65.49±0.36 | 51.88±0.42 |

Table 3. Results of Office-Home dataset under the open-domain setting.

| Method | Clipart Acc | Clipart H-score | Real-World Acc | Real-World H-score | Product Acc | Product H-score | Art Acc | Art H-score | Avg Acc | Avg H-score |
|---|---|---|---|---|---|---|---|---|---|---|
| AGG | 42.83 | 44.98 | 62.40 | 53.67 | 54.27 | 50.11 | 42.22 | 40.87 | 50.43±0.32 | 47.41±0.53 |
| MLDG [25] | 41.82 | 41.26 | 62.98 | 55.84 | 56.89 | 52.25 | 42.58 | 40.97 | 51.07±0.19 | 47.58±0.42 |
| FC [30] | 41.80 | 41.65 | 63.79 | 55.16 | 54.41 | 52.02 | 44.13 | 43.25 | 51.03±0.24 | 48.02±0.57 |
| Epi-FCR [27] | 37.13 | 42.05 | 62.60 | 54.73 | 54.95 | 52.68 | 46.33 | 44.46 | 50.25±0.50 | 48.48±0.76 |
| PAR [48] | 41.27 | 41.77 | 65.98 | 57.60 | 55.37 | 54.13 | 42.40 | 42.62 | 51.26±0.27 | 49.03±0.41 |
| RSC [21] | 38.60 | 38.39 | 60.85 | 53.73 | 54.61 | 54.66 | 44.19 | 44.77 | 49.56±0.44 | 47.89±0.79 |
| CuMix [33] | 41.54 | 43.07 | 64.63 | 58.02 | 57.74 | 55.79 | 42.76 | 40.72 | 51.67±0.12 | 49.40±0.27 |
| DAML (ours) | 45.13 | 43.12 | 65.99 | 60.13 | 61.54 | 59.00 | 53.13 | 51.11 | 56.45±0.21 | 53.34±0.45 |

Table 4. Results on closed-set PACS dataset.

| Method | A | S | P | C | Avg |
|---|---|---|---|---|---|
| AGG | 77.6 | 70.3 | 94.4 | 73.9 | 79.1 |
| CIDDG [29] | 82.0 | 74.8 | 94.6 | 74.4 | 81.4 |
| MLDG [25] | 79.5 | 71.5 | 94.3 | 77.3 | 80.7 |
| CrossGrad [43] | 78.7 | 65.1 | 94.0 | 73.3 | 77.8 |
| MetaReg [1] | 79.5 | 72.2 | 94.3 | 75.4 | 80.4 |
| JiGen [3] | 79.4 | 71.4 | 96.0 | 75.3 | 80.4 |
| MASF [10] | 80.3 | 71.7 | 94.5 | 77.2 | 81.0 |
| Epi-FCR [27] | 82.1 | 73.0 | 93.9 | 77.0 | 81.5 |
| CSD [38] | 79.8 | 72.5 | 95.5 | 75.0 | 80.7 |
| DMG [5] | 76.9 | 75.2 | 93.4 | 80.4 | 81.5 |
| CuMix [33] | 82.3 | 72.6 | 95.1 | 76.5 | 81.6 |
| DAML | 83.0 | 74.1 | 95.6 | 78.1 | 82.7 |


For the open-domain classification setting, we mainly compare with previous methods that are less influenced by the different label sets of source domains. We select state-of-the-art meta-learning-based and augmentation-based DG methods [25, 27, 33], the heterogeneous domain generalization method FC [30], and recently proposed methods of learning robust and generalizable features: PAR [48] and RSC [21].

As shown in Tables 2, 3 and 5, DAML outperforms all the compared methods by a large margin on both Acc and H-score, which indicates that DAML not only learns a generalizable representation for non-open classes but also detects open classes with higher accuracy. In particular, DAML outperforms the meta-learning-based DG methods MLDG and Epi-FCR on almost all the tasks, especially on H-score, which demonstrates that domain augmentation, compensating each domain for its missing labels, is vital to addressing the different label sets across source domains. DAML also outperforms CuMix, which likewise employs mixup for data augmentation. Note that our Dir-mixup mixes samples from multiple domains while CuMix mixes two arbitrary samples, so Dir-mixup creates mixup samples with higher variation and diversity, which encourages the model to learn more generalizable representations.

The Multi-Datasets scenario simulates the real-world setting where we aim to generalize from datasets available at hand to an unseen domain, and the different source domains hold extremely disparate label sets. In this realistic scenario, DAML outperforms all the compared methods by a large margin, indicating that DAML can be applied to realistic generalization problems and achieve higher performance.

4.4. Analysis


Ablation Study We go deeper into the DAML framework to explore the efficacy of each module in DAML, including meta-learning, Dir-mixup and distilled soft-labels. As shown in Table 6, D_s^{D-mix} indicates whether the Dir-mixup data are used in the meta-training loss, i.e. the second term in Equation 1. D_s^{D-mix'} indicates whether the Dir-mixup data are used in the meta-objective loss, i.e. the second term in Equation 2. D_s^{mix} means using classic mixup, which mixes two arbitrary samples. D_s^{distill} indicates whether the distilled soft-labels are used, i.e. the third term in Equation 1. w/ Meta indicates whether meta-learning is used; otherwise supervised learning is conducted on the augmented domains.

Table 5. Results on the Multi-Datasets scenario (naturally under the open-domain setting).

| Method | Clipart Acc | Clipart H-score | Real Acc | Real H-score | Painting Acc | Painting H-score | Sketch Acc | Sketch H-score | Avg Acc | Avg H-score |
|---|---|---|---|---|---|---|---|---|---|---|
| AGG | 29.78 | 34.06 | 65.33 | 64.72 | 44.30 | 51.04 | 27.59 | 35.41 | 41.75±0.63 | 46.31±0.57 |
| MLDG [25] | 29.66 | 35.11 | 65.37 | 54.40 | 44.04 | 50.53 | 26.83 | 34.57 | 41.48±0.68 | 43.65±0.71 |
| FC [30] | 29.91 | 35.42 | 64.77 | 63.65 | 44.13 | 50.07 | 28.56 | 34.10 | 41.84±0.73 | 45.81±0.69 |
| Epi-FCR [27] | 27.70 | 37.62 | 60.31 | 64.95 | 39.57 | 50.24 | 26.76 | 33.74 | 38.59±1.13 | 46.64±0.95 |
| PAR [48] | 29.29 | 39.99 | 64.09 | 62.59 | 42.36 | 46.37 | 30.21 | 39.96 | 41.49±0.63 | 47.23±0.55 |
| RSC [21] | 27.57 | 34.98 | 60.36 | 60.02 | 37.76 | 42.21 | 26.21 | 30.44 | 37.98±0.77 | 41.91±1.28 |
| CuMix [33] | 30.03 | 40.18 | 64.61 | 65.07 | 44.37 | 48.70 | 29.72 | 33.70 | 42.18±0.45 | 46.91±0.40 |
| DAML (ours) | 37.62 | 44.27 | 66.54 | 67.80 | 47.80 | 52.93 | 34.48 | 41.82 | 46.61±0.59 | 51.71±0.52 |



Figure 4. The Fréchet distance between each source domain and the target domain for the four generalization tasks on Office-Home dataset.

Table 6. Ablation study on the open-domain Office-Home dataset.

| D_s^{D-mix} | D_s^{D-mix'} | D_s^{mix} | D_s^{distill} | w/ Meta | Cl | Rw | Pr | Ar | Avg |
|---|---|---|---|---|---|---|---|---|---|
| - | - | - | - | ✓ | 42.2 | 64.8 | 57.6 | 49.6 | 53.6 |
| ✓ | - | - | - | ✓ | 43.8 | 64.9 | 57.1 | 51.7 | 54.4 |
| - | ✓ | - | - | ✓ | 43.8 | 65.7 | 58.2 | 52.4 | 55.0 |
| ✓ | ✓ | - | - | ✓ | 44.8 | 65.9 | 59.7 | 52.9 | 55.9 |
| - | - | ✓ | ✓ | ✓ | 44.1 | 65.1 | 59.7 | 52.2 | 55.3 |
| ✓ | ✓ | - | ✓ | - | 44.3 | 65.3 | 59.0 | 51.9 | 55.1 |
| ✓ | ✓ | - | ✓ | ✓ | 45.1 | 66.0 | 61.5 | 53.1 | 56.5 |


In Table 6, we observe that using both D_s^{D-mix} and D_s^{D-mix'} outperforms using only one of them, which indicates that Dir-mixup samples are helpful in both the meta-training and meta-objective losses. Changing Dir-mixup to classic mixup drops the accuracy, which shows the importance of a mixup built for multiple domains. Using D_s^{distill} outperforms not using it on average, indicating that transferring knowledge between domains by distilled soft-labels learns more generalizable representations. DAML outperforms meta-learning conducted on the raw domains without any domain augmentation, which indicates the importance of domain augmentation for addressing the different label sets of source domains. DAML also outperforms the variant without meta-learning, which demonstrates that meta-learning aggregates knowledge from the augmented source domains in a more effective way.

Fréchet Distance We compare the domain gap between source and target domains on features learned by the baseline AGG model and by the DAML model. We extract features of each domain and compute their mean vectors and covariance matrices. Then we evaluate the Fréchet distance [11] between the features of each source domain and the non-open-class part of the target domain. As shown in Figure 4, the domain gaps between source domains and the unseen target domain are smaller in DAML, indicating that DAML learns more generalizable representations.
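A sketch of this computation under a Gaussian assumption, using the closed-form Fréchet distance between the Gaussians fitted to the two feature sets (SciPy's `sqrtm` supplies the matrix square root):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feat_a, feat_b):
    """Fréchet distance between Gaussians fitted to two feature sets [11].

    feat_a, feat_b: (n, d) arrays of features from two domains. The
    closed form is ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^{1/2}).
    """
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    cov_a = np.cov(feat_a, rowvar=False)
    cov_b = np.cov(feat_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):  # discard tiny numerical imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))
```

Identical feature sets give a distance of (numerically) zero, and a pure mean shift contributes exactly its squared norm, which makes the metric easy to sanity-check.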

5. Conclusion


In this paper, we propose a new open domain generalization problem aiming to generalize from arbitrary source domains with disparate label sets to unseen target domains, which can be widely utilized in real-world applications. We further propose a novel Domain-Augmented Meta-Learning framework (DAML) to address the problem, which conducts meta-learning over domains augmented at feature-level by specially designed Dir-mixup and at label-level by distilled soft-labels. Extensive experiments demonstrate that DAML learns more generalizable representations for classification in the target domain than the previous generalization methods.

Acknowledgments


This work was supported by National Key R&D Program of China (2020AAA0109201), NSFC grants (62022050, 62021002, 61772299, 71690231), Beijing Nova Program (Z201100006820041), and MOE Innovation Plan of China.

References


[1] Yogesh Balaji, Swami Sankaranarayanan, and Rama Chellappa. Metareg: Towards domain generalization using meta-regularization. In Advances in Neural Information Processing Systems (NeurIPS), pages 998-1008, 2018. 2, 6, 7

[2] Zhangjie Cao, Mingsheng Long, Jianmin Wang, and Michael I. Jordan. Partial transfer learning with selective adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018. 2

[3] Fabio Maria Carlucci, Antonio D'Innocente, Silvia Bucci, Barbara Caputo, and Tatiana Tommasi. Domain generalization by solving jigsaw puzzles. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. 3, 6, 7

[4] Fabio M Carlucci, Paolo Russo, Tatiana Tommasi, and Barbara Caputo. Agnostic domain generalization. arXiv preprint arXiv:1808.01102, 2018. 2

[5] Prithvijit Chattopadhyay, Yogesh Balaji, and Judy Hoffman. Learning to balance specificity and invariance for in and out of domain generalization. In European Conference in Computer Vision (ECCV), 2020. 2, 6, 7

[6] Dan Ciregan, Ueli Meier, and Jürgen Schmidhuber. Multi-column deep neural networks for image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3642-3649. IEEE, 2012. 3

[7] Adam Coates, Andrew Ng, and Honglak Lee. An analysis of single-layer networks in unsupervised feature learning. In International Conference on Artificial Intelligence and Statistics (AISTATS), pages 215-223, 2011. 6

[8] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 248-255. Ieee, 2009. 1

[9] Terrance DeVries and Graham W Taylor. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017. 3

[10] Qi Dou, Daniel C. Castro, Konstantinos Kamnitsas, and Ben Glocker. Domain generalization via model-agnostic learning of semantic features. In Advances in Neural Information Processing Systems (NeurIPS), 2019. 2, 6, 7

[11] DC Dowson and BV Landau. The fréchet distance between multivariate normal distributions. Journal of Multivariate Analysis, 12(3):450-455, 1982. 8

[12] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning (ICML), volume 70, pages 1126-1135, 2017. 4, 5

[13] Bo Fu, Zhangjie Cao, Mingsheng Long, and Jianmin Wang. Learning to detect open classes for universal domain adaptation. In European Conference on Computer Vision (ECCV), August 2020. 7

[14] Yaroslav Ganin and Victor S. Lempitsky. Unsupervised domain adaptation by backpropagation. In Proceedings of the 32nd International Conference on Machine Learning, International Conference on Machine Learning (ICML) 2015, Lille, France, 6-11 July 2015, pages 1180-1189, 2015. 2

[15] Muhammad Ghifary, David Balduzzi, W. Bastiaan Kleijn, and Mengjie Zhang. Scatter component analysis: A unified framework for domain adaptation and domain generalization. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 39(7):1414-1430, 2017. 2

[16] Muhammad Ghifary, W Bastiaan Kleijn, Mengjie Zhang, and David Balduzzi. Domain generalization for object recognition with multi-task autoencoders. In IEEE International Conference on Computer Vision (ICCV), pages 2551-2559, 2015. 2

[17] Zhiqiang Gong, Ping Zhong, and Weidong Hu. Diversity in machine learning. IEEE Access, 7:64323-64350, 2019. 4

[18] Hongyu Guo, Yongyi Mao, and Richong Zhang. Mixup as locally linear out-of-manifold regularization. In AAAI Conference on Artificial Intelligence (AAAI), volume 33, pages 3714-3722, 2019. 3

[19] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. In IEEE International Conference on Computer Vision (ICCV), pages 2980-2988. IEEE, 2017. 1

[20] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 1

[21] Zeyi Huang, Haohan Wang, Eric P. Xing, and Dong Huang. Self-challenging improves cross-domain generalization. In European Conference on Computer Vision (ECCV), 2020. 7, 8

[22] Aditya Khosla, Tinghui Zhou, Tomasz Malisiewicz, Alexei A Efros, and Antonio Torralba. Undoing the damage of dataset bias. In European Conference on Computer Vision (ECCV), pages 158-171. Springer, 2012. 2

[23] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NeurIPS), 2012. 1

[24] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84-90, 2017. 3

[25] Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy Hospedales. Learning to generalize: Meta-learning for domain generalization. In AAAI Conference on Artificial Intelligence (AAAI), 2018. 1, 2, 6, 7, 8

[26] Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy M Hospedales. Deeper, broader and artier domain generalization. In IEEE International Conference on Computer Vision (ICCV), pages 5543-5551. IEEE, 2017. 6

[27] Da Li, Jianshu Zhang, Yongxin Yang, Cong Liu, Yi-Zhe Song, and Timothy M. Hospedales. Episodic training for domain generalization. In IEEE International Conference on Computer Vision (ICCV), October 2019. 2, 6, 7, 8

[28] Haoliang Li, Sinno Jialin Pan, Shiqi Wang, and Alex C Kot. Domain generalization with adversarial feature learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 2

[29] Ya Li, Xinmei Tian, Mingming Gong, Yajing Liu, Tongliang Liu, Kun Zhang, and Dacheng Tao. Deep domain generalization via conditional invariant adversarial networks. In European Conference on Computer Vision (ECCV), pages 624-639, 2018. 1, 2, 6, 7

[30] Yiying Li, Yongxin Yang, Wei Zhou, and Timothy M. Hospedales. Feature-critic networks for heterogeneous domain generalization. In International Conference on Machine Learning (ICML), volume 97, pages 3915-3924, 2019. 2, 7, 8

[31] M. Long, Y. Cao, J. Wang, and M. I. Jordan. Learning transferable features with deep adaptation networks. In International Conference on Machine Learning (ICML), 2015. 2

[32] Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Michael I Jordan. Conditional adversarial domain adaptation. In Advances in Neural Information Processing Systems (NeurIPS), pages 1640-1650, 2018. 2

[33] Massimiliano Mancini, Zeynep Akata, Elisa Ricci, and Barbara Caputo. Towards recognizing unseen categories in unseen domains. In European Conference on Computer Vision (ECCV), August 2020. 3, 5, 6, 7, 8

[34] Krikamol Muandet, David Balduzzi, and Bernhard Schölkopf. Domain generalization via invariant feature representation. In International Conference on Machine Learning (ICML), pages 10-18, 2013. 1, 2

[35] Pau Panareda Busto and Juergen Gall. Open set domain adaptation. In IEEE International Conference on Computer Vision (ICCV), Oct 2017. 2

[36] Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, and Bo Wang. Moment matching for multi-source domain adaptation. In IEEE International Conference on Computer Vision (ICCV), pages 1406-1415, 2019. 1, 2, 6

[37] Xingchao Peng, Ben Usman, Neela Kaushik, Judy Hoffman, Dequan Wang, and Kate Saenko. VisDA: The visual domain adaptation challenge, 2017. 6

[38] Vihari Piratla, Praneeth Netrapalli, and Sunita Sarawagi. Efficient domain generalization via common-specific low-rank decomposition. In International Conference on Machine Learning (ICML), volume 119, pages 7728-7738, 2020. 2, 6, 7

[39] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems (NeurIPS), pages 91-99, 2015. 1

[40] K. Saenko, B. Kulis, M. Fritz, and T. Darrell. Adapting visual category models to new domains. In European Conference on Computer Vision (ECCV), 2010. 6

[41] Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, and Tatsuya Harada. Open set domain adaptation by backpropagation. In European Conference on Computer Vision (ECCV), September 2018. 2

[42] Ikuro Sato, Hiroki Nishimura, and Kensuke Yokoi. APAC: Augmented pattern classification with neural networks. arXiv preprint arXiv:1505.03229, 2015. 3

[43] Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Siddhartha Chaudhuri, Preethi Jyothi, and Sunita Sarawagi. Generalizing across domains via cross-gradient training. In International Conference on Learning Representations (ICLR), 2018. 3, 6, 7

[44] Yuji Tokozume, Yoshitaka Ushiku, and Tatsuya Harada. Between-class learning for image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5486-5494, 2018. 3

[45] Vladimir Vapnik. The nature of statistical learning theory. Springer Science & Business Media, 2013. 2

[46] Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, and Sethuraman Panchanathan. Deep hashing network for unsupervised domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 6

[47] Riccardo Volpi, Hongseok Namkoong, Ozan Sener, John C Duchi, Vittorio Murino, and Silvio Savarese. Generalizing to unseen domains via adversarial data augmentation. In Advances in Neural Information Processing Systems (NeurIPS), pages 5334-5344, 2018. 3

[48] Haohan Wang, Songwei Ge, Zachary Lipton, and Eric P Xing. Learning robust global representations by penalizing local predictive power. In Advances in Neural Information Processing Systems (NeurIPS), pages 10506-10518, 2019. 7, 8

[49] Yufei Wang, Haoliang Li, and Alex C Kot. Heterogeneous domain generalization via domain mixup. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3622-3626. IEEE, 2020. 2, 3, 5

[50] Ruijia Xu, Ziliang Chen, Wangmeng Zuo, Junjie Yan, and Liang Lin. Deep cocktail network: Multi-source unsupervised domain adaptation with category shift. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3964-3973, 2018. 2

[51] Kaichao You, Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Michael I Jordan. Universal domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2720-2729, 2019. 2, 3, 7

[52] Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, and Youngjoon Yoo. CutMix: Regularization strategy to train strong classifiers with localizable features. In IEEE International Conference on Computer Vision (ICCV), pages 6023-6032, 2019. 3

[53] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. In International Conference on Learning Representations (ICLR), 2017. 4

[54] Hongyi Zhang, Moustapha Cissé, Yann N. Dauphin, and David Lopez-Paz. mixup: Beyond empirical risk minimization. In International Conference on Learning Representations (ICLR), 2018. 3, 5

[55] Han Zhao, Shanghang Zhang, Guanhang Wu, José MF Moura, Joao P Costeira, and Geoffrey J Gordon. Adversarial multiple source domain adaptation. In Advances in Neural Information Processing Systems (NeurIPS), pages 8559-8570, 2018. 2